A Query Analysis of Consumer Health Information Retrieval
Abstract
The log files of the MCW HealthLink web site were analyzed to study users' needs for consumer health information and to better understand the health topics users search for, the paths they usually take to find consumer health information, and ways to improve search effectiveness.
Introduction
HealthLink (http://healthlink.mcw.edu/), a consumer health information resource, includes several features to help users find articles of interest. First, a search feature retrieves the articles that match the keyword or phrase specified by the user. Second, HealthLink documents include Dublin Core metadata to help users locate needed health information on HealthLink web pages through external search engines [1]. Third, most HealthLink articles have been submitted to the DMOZ Open Directory (http://dmoz.org/about.html) to help people find HealthLink articles through web directories. To learn more about users' information needs and to make the consumer health information more accessible, an exploratory analysis of search queries submitted to HealthLink was performed.
Methodology
Data from HealthLink log files (December 2001) were imported into a Microsoft Access database. The records contain a subset of the queries submitted directly to HealthLink and the queries submitted through external search engines. The external query submissions were analyzed to determine the main paths users take to reach HealthLink and which search engines are most popular for finding consumer health information. The internal query submissions were analyzed to determine the health topics of interest and the characteristics of HealthLink query strings. To ascertain the subject categories of HealthLink searches, queries that were submitted more than 10 times were categorized into a set of health topics based on HealthLink topics and the Medical Subject Headings (MeSH) tree structures.
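As a rough illustration of this kind of log analysis, the Python sketch below pulls search queries out of referrer URLs and tallies query frequency and query length. It is not the authors' procedure (the study used a Microsoft Access database). The sample referrer URLs, the query parameter names (`q`, `p`, `query`), and the function names `extract_query` and `summarize` are all assumptions made for illustration.

```python
from collections import Counter
from urllib.parse import urlparse, parse_qs

# Hypothetical query-string parameter names; real search engines differ.
QUERY_PARAMS = ("q", "p", "query")


def extract_query(referrer):
    """Return (host, normalized query) from a referrer URL, or None if absent."""
    parsed = urlparse(referrer)
    params = parse_qs(parsed.query)  # decodes '+' and %XX escapes
    for name in QUERY_PARAMS:
        if name in params:
            # Lowercase and collapse whitespace so variants count together.
            text = " ".join(params[name][0].lower().split())
            if text:
                return parsed.netloc, text
    return None


def summarize(queries, min_count=10):
    """Tally query frequency and query length (in words)."""
    freq = Counter(queries)
    lengths = [len(q.split()) for q in queries]
    total = len(queries)
    return {
        "top": freq.most_common(10),
        # Queries submitted more than min_count times: candidates for
        # manual categorization into health topics.
        "frequent": sorted(q for q, n in freq.items() if n > min_count),
        "mean_words": sum(lengths) / total if total else 0.0,
        "share_one_or_two_words": (
            sum(1 for n in lengths if n <= 2) / total if total else 0.0
        ),
    }


if __name__ == "__main__":
    # Made-up referrers, for demonstration only.
    referrers = [
        "http://www.google.com/search?q=low+back+pain",
        "http://search.example.com/results?p=Shingles",
        "http://www.example.org/page.html",
    ]
    hits = [h for h in map(extract_query, referrers) if h]
    print(Counter(host for host, _ in hits))
    print(summarize([q for _, q in hits], min_count=0))
```

Assigning the frequent queries to health topics would still be a manual or vocabulary-driven step; the sketch only identifies which queries pass the frequency threshold.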
The following data were collected and analyzed: search engine use frequency; query frequency distribution; subject categorization of queries; the health topics that received the most queries; query construction and pattern; and average number of words per query.
Results
Most users found HealthLink articles through external search engines rather than by searching directly on the HealthLink web site. The top 10 external search engines used for HealthLink article retrieval in December 2001 were Google, MSN, Yahoo, AOL, Lycos, Overture, AltaVista, Ask, Excite, and Dogpile. The 10 most frequent search queries in December 2001 were: symptoms (671), carotid (389), shingles (327), low back pain (279), kidney infection (255), calcifications (202), neck (144), urinary tract infection (134), back pain (131), and microcalcification (129). The most frequently searched health topics were infections, nutrition/herbs, skin diseases, kidney diseases, wellness, digestive diseases, neurological disorders, musculoskeletal disorders, back problems, and heart diseases. Only a few search queries were used repeatedly; most were submitted only once. Queries submitted to the HealthLink web site were usually simple and short. On average, a HealthLink query contained 2.1 words. About 74% of all queries contained one or two words, and 37.15% of the queries had only one word. Fewer than 5% of the queries had more than 5 words.
Conclusions
The query analysis reveals users' interests and search behaviors. The top lists of search queries and topics show what users look for and their top health concerns. It seems that the more common conditions and diseases, along with general health information, receive the most searches. Access to consumer health information could be enhanced by using MeSH or other controlled vocabularies to generate indexing terms, organize search results, facilitate retrieval of articles, and improve retrieval effectiveness based on information retrieval feedback.
References
1. Kahn CE. Design and implementation of an Internet-based health information resource. Computer Methods and Programs in Biomedicine. 2000; 63: 85-97.
AMIA 2002 Annual Symposium Proceedings, p. 1046
Similar resources
QEA: A New Systematic and Comprehensive Classification of Query Expansion Approaches
A major problem in information retrieval is the difficulty of defining the user's information needs; in addition, when a user submits a query, there is a vast amount of information to retrieve. Different methods have therefore been suggested for query expansion, which is concerned with reconfiguring the query to increase efficiency and improve accuracy in the information...
Improved Skips for Faster Postings List Intersection
Information retrieval can be achieved through computerized processes by generating a list of relevant responses to a query. The document processor, matching function, and query analyzer are the main components of an information retrieval system. Document retrieval systems are fundamentally based on Boolean, vector-space, probabilistic, and language models. In this paper, a new methodology for mat...
Team DA_IICT at Consumer Health Information Search @FIRE2016
The Consumer Health Information Search task focuses on retrieving multiple relevant perspectives for complex health search queries. The task addresses queries that do not have a single definitive answer but for which diverse points of view are available. This paper reports the results of standard retrieval methods for identifying the aspects of the retrieval results with respect to the query.
Prototyping a Vibrato-Aware Query-By-Humming (QBH) Music Information Retrieval System for Mobile Communication Devices: Case of Chromatic Harmonica
Background and Aim: The current research aims at prototyping query-by-humming music information retrieval systems for smart phones. Methods: This multi-method research follows simulation technique from mixed models of the operations research methodology, and the documentary research method, simultaneously. Two chromatic harmonica albums comprised the research population. To achieve the purpose ...